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Abstract 

Background: The chemical composition of aerosols and particle size distributions are the most significant factors 
affecting air quality. In particular, the exposure to finer particles can cause short and long-term effects on human 
health. In the present paper PM 10 (particulate matter with aerodynamic diameter lower than 10 |um), CO, N0 X (NO 
and N0 2 ), Benzene and Toluene trends monitored in six monitoring stations of Bari province are shown. The data 
set used was composed by bi-hourly means for all parameters (12 bi-hourly means per day for each parameter) and 
it's referred to the period of time from January 2005 and May 2007. The main aim of the paper is to provide a clear 
illustration of how large data sets from monitoring stations can give information about the number and nature of 
the pollutant sources, and mainly to assess the contribution of the traffic source to PM 10 concentration level by 
using multivariate statistical techniques such as Principal Component Analysis (PCA) and Absolute Principal 
Component Scores (APCS). 

Results: Comparing the night and day mean concentrations (per day) for each parameter it has been pointed out 
that there is a different night and day behavior for some parameters such as CO, Benzene and Toluene than PM 10 . 
This suggests that CO, Benzene and Toluene concentrations are mainly connected with transport systems, whereas 
PM 10 is mostly influenced by different factors. 

The statistical techniques identified three recurrent sources, associated with vehicular traffic and particulate 
transport, covering over 90% of variance. The contemporaneous analysis of gas and PM 10 has allowed underlining 
the differences between the sources of these pollutants. 

Conclusions: The analysis of the pollutant trends from large data set and the application of multivariate statistical 
techniques such as PCA and APCS can give useful information about air quality and pollutant's sources. These 
knowledge can provide useful advices to environmental policies in order to reach the WHO recommended levels. 
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Background 

The knowledge of chemical composition and sources of 
air polluted is demanded in any program aimed at con- 
trolling the levels of pollutants in order to evaluate and 
reduce their impact on human health. 

The inhalation of air polluted, in fact, with particulate 
matter (PM 10 ) and or irritant gases such as N0 2 and 
S0 2 is associated with both short-term and long term 
health effects, most of which impact on respiratory and 
cardiovascular system [1]. For example the atmospheric 
concentrations of N0 2 have been linked to the deaths 
of severely asthmatic patients in Barcelona [2], child 
asthma cases in Toronto and Southern California [3,4], 
heart rate dysfunction in Taiwan and Switzerland [5,6], 
and ischemic heart disease in elderly residents of French 
cities [7]. Similar examples can be chosen to illustrate 
the damaging effects of PM 10 inhalation, whether it be 
asthma in Madrid or Sydney [8,9] or all-cause mortality 
(especially stroke) in Boston [10]. 

The federal Clean Air Act Amendments of 1990 mandate 
that the U.S. EPA determine a set of urban hazardous air 
pollutants (PAHs, or air toxics') that potentially pose the 
greatest risks in urban areas, in terms of contribution to 
population health risk. The current set of 188 PAHs in- 
cludes toxic metals and volatile organic compounds 
(VOCs). The U.S. EPA identified 33 urban PAHs based on 
emissions and toxicities in a 1995 ranking analysis [11] 
and developed concurrent monitoring and modelling pro- 
grams to evaluate potential exposures and risks to these 
top-ranked 33 PAHs. Developing effective control strategies 
to reduce population exposure to certain PAHs requires 
identifying sources and quantifying their contributions to 
the mixture of PAHs and the associated health risks. One 
approach is to use receptor-based source apportionment 
models to distinguish sources. Most source apportion- 
ment studies focus on analysing either VOCs [12,13] or 
fine particle (PM 2 . 5 ) mass [14-16]. Only few studies used 
source apportionment modelling to identify common 
sources of both VOCs and PM 2<5 . In other source appor- 
tionment studies that included both non-organic trace 
elements on PM and gaseous pollutants [17-20], the 
gaseous species usually were non-VOCs (such as CO, 
S0 2 , and NO). 

In recent years, there has been an increased interest in 
the application of chemometrics [21] to different envir- 
onmental research fields, ranging from water to air pol- 
lution and cultural heritage [22-25]. One aspect of the 
application of chemometrics to environmental pollution 
research is often referred to as source apportionment, 
receptor modelling and/or mixture analysis discipline. 
Recent examples of such work can be found in Europe 
[26,27], the US [28,29] and Asia [30,31]. In the fields of 
pollution sciences (air or water), source apportionment 
models aim to re-construct the emissions from different 



sources of pollutants based on ambient data registered 
at monitoring sites [32]. 

In the present paper a bihourly data set of PM 10 , CO, 
NO x , Benzene and Toluene collected in six air quality 
monitoring stations of Bari territory from January 2005 
to May 2007 is used. The main aim of this paper is to 
provide a clear illustration of how large data sets from 
monitoring stations can give information about the 
number and nature of the pollutant sources, and mainly 
to assess the contribution of the traffic source to PM 10 
concentration level by using multivariate statistical 
techniques. 

These knowledge could provide useful advices to envir- 
onmental policies in order to reach the WHO recom- 
mended levels. In fact legislative efforts to reduce the 
health effects of air pollutants are currently being applied 
throughout the developed world, with the imposition of 
averaged limit values which vary for different pollutants. 
In the case of PM 10 , the World Health Organization has 
recommended progressive achievement of four pollution 
thresholds which cascade down through three Interim 
Targets (IT1 % 70 [ig/m 3 ; IT2 % 50 [ig/m 3 ; IT3 % 30 (ig/m 3 ) 
to reach the ultimate objective: an Air Quality Guideline 
(AQG) annual mean of just 20 \xg m 3 PM 10 [33,34]. More- 
over considering the latest Italian law [35,36] for PM 10 the 
annual limit value is 40 (ig/m 3 , while the daily limit value 
is 50 (ig/m 3 ; for NO x the annual limit value is 40 (ig/m 3 , 
while hourly limit value is 200 (ig/m 3 ; for Benzene the an- 
nual limit value is 5 (ig/m 3 and for CO the 8 hour mean 
limit value is 10 mg/m 3 . 

Results and discussion 

In the Table 1 the basic statistics for each site have been 
summarized. Among all the available sampling sites, 
only those with the number of data not less than 5000 
were used, considering only days with complete data (12 
daily data). High variability is explained by the long 
range of the period (2.5 years). Pollutants concentrations 
were reported as (ig/m 3 , except for CO which is expressed 
as mg/m 3 . 

From data collected, night and daily mean concentra- 
tions (per day) for each parameter have been obtained. 
Night and daily mean values have been plotted for each 
parameter and graphics, as Figures 1, 2, 3 and 4 shown, 
have been obtained for each sampling site. 

Observing the Figures 1, 2, 3 and 4 shown as example, 
parameters such as CO, Benzene and Toluene show dif- 
ferent trend between night and daily values, whit daily 
mean values bigger than night ones. In particular for the 
data shown in Figures 1, 2 and 3 the percentage ratio be- 
tween (daily mean - night mean) and daily mean for CO, 
Benzene and Toluene is 53%, 49%, 54% respectively. 
Considering Toluene trend shown in Figure 3 it is pos- 
sible to note for some days, e. g. 05/05/2005 or 22/02/ 
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Table 1 Descriptive statistics for each parameter in each site 



Viale Archimede, Bari (urban site) San Nicola sport stadium, Bari (suburban site) Viale King, Bari (urban site) 





Data 


Mean 


STD. DEV. 


Data 


Mean 


STD. DEV. 


Data 


Mean 


STD. DEV. 


NOx 


7456 


42.15 


41.69 


6648 


24.65 


32.82 


7008 


61.64 


57.88 


Benzene 


7456 


1.49 


1.38 


6648 


1.08 


1.48 


7008 


1.56 


1.62 


Toluene 


7456 


4.18 


4.07 


6648 


2.40 


4.02 


7008 


4.97 


7.46 




7456 


0.45 


0.35 


6648 


0.44 


0.18 


7008 


0.58 


0.41 


PMW 


7456 


29.71 


14.36 


6648 


31.59 


20.26 


7008 


25.66 


16.08 




Altamura, prov. of Bari (urban site) 


Andria, prov. of Bari (urban site) 


Monopoli, prov. of Bari (urban site) 




Data 


Mean 


STD. DEV. 


Data 


Mean 


STD. DEV. 


Data 


Mean 


STD. DEV. 


NOx 


5136 


46.12 


40.08 


5568 


43.89 


42.79 


5244 


50.65 


47.60 


Benzene 


5136 


1.83 


1.62 


5568 


1.66 


0.95 


5244 


1.98 


1.06 


Toluene 


5136 


6.74 


9.27 


5568 


4.72 


4.47 


5244 


4.98 


4.18 


CO 


5136 


0.43 


0.32 


5568 


0.77 


0.73 


5244 


0.46 


0.34 


PMW 


5136 


27.83 


20.83 


5568 


22.00 


14.63 


5244 


36.11 


21.91 



2006, very high daily mean values on the contrary of 
Benzene ones shown in Figure 2. The reason is due to 
the presence of another pollution source affecting the 
monitoring site, probably identifiable in the painting of 
pedestrian crossing and road stripes. 

Considering the PM 10 night and dilay mean concentra- 
tions (Figure 4) its possible to note that they don't show 
a clear difference between day and night: in fact the ratio 
for PM 10 is 16%. Moreover for some days, e. g. 25/03/ 
2005 and 06/02/2006, the thermodynamic conditions in 
the planetary boundary layer (PBL) adversely affect pol- 
lutants dispersion leading to PM 10 night values bigger 
than daily ones, in spite of emission sources reduction 
during the night. 

The different night and daily behavior suggests that 
parameters such as CO, Benzene and Toluene are mainly 
connected with transport systems, whereas PM 10 is mostly 
influenced by different factors. 

The parameters trends shown in Figures 1, 2, 3 and 4, 
related to Viale Archimede data, are similar to ones of 



the other sites. So the different behaviour between PM 10 
and the other parameters (CO, Benzene, Toluene) can 
be considered common to the whole area investigated: 
Bari and Bari province. 

Moreover, as we have shown in a previous papers [37], 
the results obtained both by automatic monitoring sta- 
tions and sampling campaigns in several sites of Apulia 
region, suggest that the PM 10 amount monitored in this 
area presents a common contribution also among moni- 
toring stations located at 70 km far each other: the com- 
mon contribution apparently does not depend from 
local sources. Moreover in the reference 37 we pointed 
out that PM 10 concentrations do not show a seasonal 
trend, contrary to the PM 10 trend shown in the towns of 
North Italy [38,39]. 

In order to identify the pollutant sources that contrib- 
ute to PM 10 concentrations and try to distinguish the 
contribution of local sources, such as vehicular traffic, as 
respect to "a common regional source" (that is re- 
suspended matter, dust intrusions, calcium carbonate 
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Figure 2 Benzene concentrations daily and night trend for all the data collected in Viale Archimede. 



source), APCS model has been applied to the data 
collected. 

According to the criteria described in the methods sec- 
tion we have chosen the ODV90 one, revealing that three 
components are necessary and sufficient to run properly 
the model 

In Table 2 the loadings values for the PC analysis ap- 
plied to the data collected in all the sites during January 
2007 are shows as example. Three factors explain almost 
the 92% of the total variance of data for all the sites. 
Factor loadings are used to obtain information about 
sources profiles. The first factor (or first principal com- 
ponent, PCI) accounting for a percentage of the total 
variance ranging between 40% and 51% was dominated 
by high loading values of Benzene, Toluene and CO, or 
by NO x and CO depending on the sites; the second fac- 
tor (or second principal component, PC2), accounting 
for a percentage ranging between 24% and 31% of the 
total variance, is dominated by PM 10 or by Benzene and 
Toluene, while the third factor explaining a percentage 
ranging between 21% and 25% of the total variance had 
high loadings values for Benzene and Toluene or PM 10 . 



Applying PCA on all data set generally we found that 
for each sampling site one of the three factors is charac- 
terized by high loading values of PM 10 , the other two 
factors are characterized by high loading values of NO x , 
CO, Benzene and Toluene. 

Observing Figure 5 its possible to note that PM 10 is 
the dominant parameter on the second component with 
high loading values. 

In order to identify the three sources the Absolute 
Principal Component Scores model has been applied to 
data sets. In the Tables 3 and 4 the source s profiles for each 
monitoring station are shown as example. The source s pro- 
files shown are the average, obtained during the Summer 
2006 and Winter 2006, of the monthly profiles. 

Observing Table 3 and 4 that show the parameters distri- 
bution in the three pollution sources, averaged on the 
whole monitoring period, one can see that the profile of the 
second source is mostly characterized by PM 10 . The other 
two sources are differently characterized by NO x , Benzene, 
Toluene, CO and for a little contribution by PM 10 . 

Moreover comparing the sources profile concentra- 
tions between Summer and Winter seasons its possible 
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Figure 3 Toluene concentrations daily and night trend for all the data collected in Viale Archimede. 
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Figure 4 PM10 concentrations daily and night trend for all the data collected in Viale Archimede. 



to note a constant increasing of NO x concentration from 
Summer to Winter for all sites and sources. In particular 
the first source shows for all sites bigger NO x concentra- 
tions in the Winter than Summer ones. The first source 
can be considered a mixed source between vehicular 
traffic and domestic heating. 

In Figure 6 the percentage distribution of the parame- 
ters in the three sources is represented. The plot is ob- 
tained from monthly sources profile averaged for all 
sampling period of time and among all monitoring sites. 

Over 85% percent of the mass of PM 10 is attributed to 
the second source. The first and third sources, com- 
posed by NO x , CO and aromatic compounds, and low 
level of PM 10 , are characterized by similar level of 



benzene and toluene. In particular the Toluene and Ben- 
zene concentrations ratio in the first and third sources 
profiles are bigger than 2 (except for San Nicola sport 
stadium monitoring site): in literature this value is asso- 
ciated to vehicular traffic emissions. Moreover NO x and 
CO are predominant in the first source. The amount of 
PM 10 in the third source, even if low, is 50% higher than 
first source. 

These observations suggest that the second source 
could be identified as "Particulate source", while the first 
and third sources can be considered different compo- 
nents of vehicular traffic emissions. In fact, no industrial 
plants or similar are located close the sampling sites, 
and the traffic is the most important source of pollution 



Table 2 Principal Component Analysis loadings for all the sites investigated during January 2007 



Viale Archimede, Bari 
(urban site) 



San Nicola sport stadium, Bari 
(suburban site) 



Viale King, Bari (urban site) 





PCI 


PC2 


PC3 


PCI 


PC2 


PC3 


PCI 


PC2 


PC3 


NOx 


0.14 


0.84 


0.01 


0.90 


0.02 


0.02 


0.09 


0.84 


0.04 


Benzene 


0.75 


0.06 


0.13 


0.07 


0.77 


0.13 


0.70 


0.14 


0.14 


Toluene 


0.75 


0.09 


0.09 


0.53 


0.18 


0.12 


0.84 


0.09 


0.04 


CO 


0.67 


0.23 


0.00 


0.48 


0.29 


0.10 


0.30 


0.43 


0.15 


PMW 


0.06 


0.01 


0.93 


0.06 


0.12 


0.82 


0.08 


0.05 


0.86 


Eigenvalues 


2.37 


1.22 


1.17 


2.04 


1.37 


1.19 


2.02 


1.55 


1.23 


% variance explained 


47.3 


24.5 


23.4 


40.9 


27.5 


23.9 


40.5 


31.1 


24.7 




Altamura, prov. of Bari (urban site) 


Andria, prov. of Bari (urban site) 


Monopoli, prov. of Bari (urban site) 




PCI 


PC2 


PC3 


PCI 


PC2 


PC3 


PCI 


PC2 


PC3 


NOx 


0.84 


0.03 


0.11 


0.87 


0.11 


0.01 


0.91 


0.00 


0.03 


Benzene 


0.22 


0.10 


0.66 


0.33 


0.47 


0.17 


0.68 


0.19 


0.05 


Toluene 


0.34 


0.27 


0.31 


0.24 


0.61 


0.13 


0.05 


0.01 


0.95 


CO 


0.89 


0.00 


0.09 


0.86 


0.12 


0.01 


0.90 


0.03 


0.02 


PMW 


0.01 


0.94 


0.05 


0.01 


0.06 


0.93 


0.02 


0.96 


0.01 


Eigenvalues 


2.30 


1.34 


1.22 


2.31 


1.38 


1.24 


2.55 


1.19 


1.05 


% variance explained 


46.0 


26.8 


24.5 


46.1 


27.6 


24.8 


51.1 


23.9 


21.0 
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Figure 5 Scores and Loading plots in the plane of first and second Principal Component (PCI and PC2) obtained for the data set of 
parameters collected during January 2007 in viale Archimede (a), S. Nicola sport stadium (b), viale King (c), Altamura (d), Andria (e) and 
Monopoli (f) sites. 



of anthropic nature. The two traffic sources might be 
originated by different kinds of vehicles or engines, for 
example gas and diesel. These different fuels are known 
to be responsible of different emission of pollutants. In 
particular diesel, before the introduction of filters, was 
the major source of particulate matter among the several 
fuels used for road transport, with lower emissions of 
NO x and CO. Considering also the constant increasing 
of NO x concentration from Summer to Winter for all 
sites and sources (Tables 3 and 4) the first source could 
be identified with a mixed source between vehicular 



traffic and domestic heating, while the third source with 
vehicular traffic. 

Another proof linking the first and third sources to ve- 
hicular emissions is the daily profile of bihourly mean con- 
centrations contributions of the three sources (Figure 7). 
In Figure 7 its clearly showed that the particulate source 
shows a rather constant trend during the day and it is un- 
corrected with the traffic sources. The other two sources 
show, instead, a typical traffic profile, with peaks of emis- 
sion at 8 in the morning and 20 in the evening, in corres- 
pondence of rush hours of people going back and forth 



lelpo et al. Chemistry Central Journal 2014, 8:14 
http://journal.chemistrycentral.eom/content/8/1 /1 4 



Page 7 of 1 2 



Table 3 APCS source's profiles for the data collected during the Summer season 2006 in all the monitoring stations 
investigated 





Viale Archimede, Bari (urban site) 


San Nicola sport stadium, Bari (suburban site) 


Viale King, Bari (urban site) 




1 Source 


II Source 


III Source 


1 Source 


II Source 


III Source 


1 Source 


II Source 


III Source 


A/Ox 


29.25 


6.56 


21.07 


13.15 


3.80 


6.89 


20.38 


3.74 


23.74 


Benzene 


0.468 


0.239 


0.515 


0.066 


0.264 


0.353 


0.521 


0.139 


0.450 


Toluene 


1.688 


0.952 


1.685 


0.794 


0.409 


0.402 


1.223 


0.943 


1.570 


CO 


0.121 


0.045 


0.110 


0.054 


0.066 


0.044 


0.086 


0.049 


0.726 


PMW 


3.91 


23.06 


4.96 


3.35 


25.45 


2.57 


3.29 


24.46 


1.77 




Altamura, prov. of Bari (urban site) 


Andria, prov. of Bari (urban site) 


Monopoly prov. of Bari (urban site) 




1 Source 


II Source 


III Source 


1 Source 


II Source 


III Source 


1 Source 


II Source 


III Source 


A/Ox 


12.40 


2.28 


8.35 


25.87 


1.26 


13.56 


18.02 


2.71 


18.05 


Benzene 


0.100 


0.056 


0.368 


0.691 


0.122 


0.246 


0.848 


0.278 


0.276 


Toluene 


0.278 


0.358 


1.466 


1.501 


0.720 


1.277 


1.649 


0.525 


1.392 


CO 


0.116 


0.039 


0.081 


0.340 


0.014 


0.133 


0.187 


0.033 


0.060 


PMW 


1.19 


40.45 


6.92 


1.74 


17.69 


1.07 


3.21 


38.12 


1.29 



Parameter's unit is ug/m . 



from workplace. In Figure 7 the 2005 seasonal trend for 
viale Archimede monitoring site is shown as example: all 
sites have shown similar trend. 

Table 5 shows the coefficients of correlation among 
the six sites of the three sources in the APCS profiles 
matrix. According to this data, we can observe that the 
source Particulate shows high correlation among four 
sites of different zones (Bari and Province). This makes 
our hypothesis of a regional character for PM 10 con- 
centrations [37]; Monopoli and San Nicola sites don't 
show correlation and this can be explained considering 
the different nature of these sites: Monopoli is a urban 
sites while San Nicola is a suburban site skirting by 



high vehicular traffic street and whit high vehicular 
traffic spot during sport events (generally in the week 
end). 

On the contrary, considering the vehicular traffic 
sources its possible to observe low correlation among 
the sites due to different location of the sampling sites. 

Table 6 shows the reconstruction percentage error of 
the APCS model for each parameter. The error shows 
high variability over the range of the period. PM 10 con- 
centrations have shown the lowest error of reconstruc- 
tion, while the CO concentrations the biggest ones. The 
model, in fact, suffers of low robustness when values are 
low (this is the case of carbon monoxide). 



Table 4 APCS Source's profiles for the data collected during the Winter season 2006 in all the monitoring stations 
investigated 



Viale Archimede, Bari (urban site) San Nicola sport stadium, Bari (suburban site) Viale King, Bari (urban site) 





1 Source 


II Source 


III Source 


1 Source 


II Source 


III Source 


1 Source 


II Source 


III Source 


A/Ox 


35.52 


15.65 


21.65 


20.74 


4.13 


12.84 


62.46 


21.17 


18.41 


Benzene 


0.537 


1.074 


0.391 


0.178 


0.706 


0.729 


0.866 


1.356 


0.822 


Toluene 


1.943 


2.726 


1.192 


0.647 


1.038 


0.663 


2.436 


3.244 


3.622 


CO 


0.204 


0.114 


0.080 


0.108 


0.135 


0.123 


0.339 


0.326 


0.159 


PMW 


1.38 


23.93 


1.03 


1.83 


19.68 


4.64 


2.76 


29.67 


2.62 




Altamura, prov. of Bari (urban site) 


Andria, prov. of Bari (urban site) 


Monopoli, prov. of Bari (urban site) 




1 Source 


II Source 


III Source 


1 Source 


II Source 


III Source 


1 Source 


II Source 


III Source 


NOx 


46.16 


22.56 


10.10 


50.30 


9.36 


10.65 


34.09 


3.55 


15.75 


Benzene 


0.827 


0.852 


0.948 


0.496 


0.514 


0.368 


0.689 


0.637 


0.427 


Toluene 


6.041 


8.426 


3.781 


1.616 


1.730 


1.573 


1.752 


1.471 


3.063 


CO 


0.347 


0.108 


0.077 


0.538 


0.106 


0.114 


0.270 


0.080 


0.116 


PMW 


1.47 


21.93 


1.34 


1.70 


23.54 


2.38 


2.20 


23.89 


2.50 



Parameter's unit is ug/m 3 . 
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Figure 6 Average percentage of the parameters distribution in 
the three sources. 



Anyway, in most of the cases the error was acceptable, 
allowing a fairly good reconstruction of the concentra- 
tion trend. 

Experimental 

The air quality monitoring network 

Bari is a town of about 350000 inhabitants located in 
South-East of Italy (latitude 41°08; longitude 16°45'). Its 
greater industrial activities are in mechanical (carpentry 
and industrial vehicles), food and clothing sectors; its in- 
dustrial area, whit a thermo electrical power station, is 
placed in the neighbouring towns. 



Prevailing winds are from NNW and WNW in December, 
January and February, from East in March and September 
and from NNE and South in October and November. 
Raining days are 80 - 90 for year with maxima 40 - 
50 mm. The region is characterized by an active photo- 
chemistry mostly in the summer season. 

Like many other Italian cities, its urban area is charac- 
terized by high motor-vehicle traffic density, mostly in 
the centre of the city. 

The air quality monitoring network of the Bari Municipal- 
ity is composed by six fixed monitoring stations, by a mobile 
laboratory and a data elaboration centre. In province of Bari, 
that extends for 3.825 km 2 and includes 41 towns, there 
are four fixed monitoring stations located in the towns of 
Casamassima, Altamura, Andria and Monopoli. 

In this paper some stations of Bari and its province 
monitoring networks have been selected as representa- 
tive sites of the investigated area. In Bari, the selected 
monitoring stations are located in residential area (viale 
King), in urban area (viale Archimede) and in a subur- 
ban area (S. Nicola sport stadium). 

In province of Bari, the three selected stations are lo- 
cated in the urban and residential areas of the following 
towns: Altamura (67000 inhabitants) located at 47 Km 
south-westwards from Bari, Andria (98000 in.) at 55 Km 
northwards from Bari and Monopoli (50000 in.) a 
coastal town at 40 Km southwards from Bari. 
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Figure 7 Daily patterns for seasonal bihourly mean concentration contributions of the three sources (x axes is referred to local time: 
winter local time: GMT + 1 h; summer local time: GMT + 2 h) for the data collected in viale Archimede during spring (a), summer 
(b), autumn (c) and winter (d) 2005. 
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Table 5 Matrix of correlation coefficients of the three sources obtained by APCS sources profiles 



Mixed source 





Altamura Andria 


Monopoli Archimede 


King 


San Nicola 


Altamura 


1 .00 0.07 


-0.04 0.25 


-0.44 


0.35 


Andria 


1.00 


-0.32 -0.02 


0.44 


-0.23 


Monopoli 




1 .00 -0.06 


0.05 


0.32 


Archimede 




1.00 


-0.19 


0.25 


King 






1.00 


-0.26 


San Nicola 








1.00 






Particulate 








Altamura Andria 


Monopoli Archimede 


King 


San Nicola 


Altamura 


1 .00 0.65 


-0.12 0.39 


0.59 


-0.12 


Andria 


1.00 


0.10 0.56 


0.71 


0.13 


Monopoli 




1 .00 0.54 


0.05 


0.24 


Archimede 




1.00 


0.51 


0.09 


King 






1.00 


-0.01 


San Nicola 








1.00 






Vehicular traffic 








Altamura Andria 


Monopoli Archimede 


King 


San Nicola 


Altamura 


1 .00 0.03 


0.16 0.06 


-0.07 


-0.17 


Andria 


1.00 


-0.29 0.36 


0.33 


0.19 


Monopoli 




1.00 -0.12 


0.60 


0.20 


Archimede 




1.00 


0.51 


0.46 


King 






1.00 


0.72 


San Nicola 








1.00 


Values higher than 0.5 are highlighted in bold text. 








All considered sites can be classified as urban back- Diego CA USA), PM 10 (Opsis model SM 200, Furulund, 
ground sites, except for Monopoli that is a urban site Sweden and MP 100, Environnement, France) and with 
and San Nicola that is a suburban site skirting by high meteorological sensors such as temperature, barometric 
vehicular traffic street and whit high vehicular traffic pressure, relative humidity, solar radiation, speed and 
spots during sport events. direction of wind and rain. 

Nitrogen oxides, NO and N0 2 , were analysed using 
the chemiluminescence method. Measurement of ozone 
The instrumentation is based upon the capacity of such gas to absorb ultra- 
Each station is provided with automatic analysers of CO violet rays with opportune wavelengths, generated by 
(Advanced pollution Instrumentation model 300E, San built-in lamp. Carbon monoxide is analysed through the 
Diego CA USA), BTX (model Syntech Spectras GC 855, absorption of infrared rays (IR). 

Groningen, Netherlands), 0 3 (Advanced pollution In- The measuring of PM 10 is based upon the beta ray at- 
strumentation model 400E, San Diego CA USA), NO x tenuation method on standard 47 mm membrane filters; 
(Advanced pollution Instrumentation model 200A, San the data are bihourly collected. 


Table 6 Percentage error of reconstruction for each parameter for each site 








Altamura Andria 


Monopoli Japigia San Nicola 


King 


Range 


NOx 


11.13 6.99 


11.22 18.16 12.14 


8.81 


6.99-18.16 


Benzene 


30.55 27.27 


28.06 29.16 36.35 


37.26 


27.27-37.26 


Toluene 


9.94 1 1 .44 


8.66 21.50 15.33 


11.12 


8.66-21.50 


CO 


39.05 28.17 


39.51 42.24 55.18 


41.77 


28.17-55.18 


PM10 


2.74 2.62 


1.60 3.67 5.78 


3.35 


1.60-5.78 
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Benzene/Toluene/Xylene are measured using the capil- 
lary gas chromatographic technique in the gaseous phase, 
which enables the rapid separation and identification 
(15 minutes) of the components of the gas sample. 

The data 

The data are collected by the system every hour for all 
parameters, except for PM 10 that are collected every two 
hours. Therefore, all data are considered with means 
every two hours (even hours). 

In order to simplify the further statistical elaborations, 
only days with complete data, that is days with all 12 bi- 
hourly means were considered for data set. 

The data collected by the monitoring network was vali- 
dated according to this protocol: a preliminary validation 
was carried out by the software, which has invalidated all 
data occurred in calibration hours, and data identified 
as artifacts; then, a manual calibration was carried out 
by operators, considering the relations existing among 
the several parameters: for example, the validation of 
parameters monitored by the same instrument (i.e. ben- 
zene and toluene, or the nitrogen oxides), was carried 
out simultaneously, like so for parameters linked by the 
same hypothetical source (i.e. carbon oxide and aro- 
matic compounds, typical traffic pollutants). In this way 
it is possible to verify that eventual critical data are related 
to real pollution situations, and they are not artifacts due 
to instrument malfunction. Moreover, meteorological data 
(rain, speed and direction wind) were used to investigate 
about the influence of natural events on high or low con- 
centration situation. 

The data have been collected during the period of time 
from January 2005 to May 2007 in the investigated sites. 

In the Table 1 the basic statistics for each site have 
been summarized. 

Conclusions 

Multivariate statistical techniques such as receptor models 
offer a valid tool to handle complex data sets and allow to 
extract information not directly inferable from original 
data matrix by traditional approach. 

In our case the model suggests that the major amount 
of PM 10 isn't linked directly to the vehicular traffic. Its 
probably due to PM 10 long and medium range transport 
and due to formation of secondary particulate. The model 
confirms a common regional contribution to PM 10 among 
sites and the absence of PM 10 seasonal trend observed. 

Even if the model is applied to few parameters, it is 
able to suggest information about the nature of the pol- 
lutions sources. However for the determination of the 
other important pollution sources, such as domestic 
heating, its needed to obtain parameters that allow to 
identify this source. 



The results obtained by the models moreover confirm 
that PM 10 concentration cannot be considered a good air 
quality indicator because it don't reflect the real pollution s 
sources. 

Methods 

The model description 

The aim of the application of the receptor models is the 
apportionment of the pollutant s sources. The two main 
approaches of receptor models are Chemical Mass Bal- 
ance (CMB) and multivariate factor analysis (FA). CMB 
gives the most objective source apportionment and it 
needs only one sample; however, it assumes knowledge 
of the number of sources and their emission pattern. On 
the other hand, FA attempts to apportion the sources 
and to determine their composition on the basis of a 
series of observations at the receptor site only [40]. 
Among multivariate techniques, Principal Component 
Analysis (PC A) is often used as an exploratory tool to 
identify the major sources of air pollutant emissions 
[38,41-43]. The great advantage of using PCA as a re- 
ceptor model is that there is no need for a priori know- 
ledge of emission inventories [44]. 

PCA is a statistical method that identifies patterns in 
data, revealing their similarities and differences [45]. 
PCA creates new variables, the principal components 
scores (PCS), that are orthogonal and uncorrelated to 
each other, being linear combinations of the original var- 
iables. They are obtained in such a way that the first PC 
explains the largest fraction of the original data variabil- 
ity, the second PC explains a smaller fraction of the data 
variance than the first one and so forth [46-48]. Varimax 
rotation is the most widely employed orthogonal rota- 
tion in PCA, because it tends to produce simplification 
of the unrotated loadings to easier interpretation of the 
results. It simplifies the loadings by rigidly rotating the 
PC axes such that the variable projections (loadings) on 
each PC tend to be high or low. 

Moreover the reconstruction of the source profile and 
contribution matrices can be successfully obtained by APCS 
(Absolute Principal Component Scores) method [49]. 

The observed pollutant concentration in the atmosphere 
at a certain time Q can be considered as a linear combin- 
ation of contributions from p sources: 



Q = y^atkSk ' ' 

K=l 

where S k is the contribution from each source and a ik is 
the fraction of source k contribution possessing property 
i at the receptor. 

One of the most used methods to decompose the con- 
centration matrix in the product of the source pattern 
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and contribution matrices is the APCS. The starting 
point is the matrix X (samples x parameters). In the 
APCS method the first step is the search of the Eigen- 
values and Eigenvectors of the data correlation matrix 
G. Only the most significant p Eigenvectors (or factors) 
are taken into account. Generally two methods are used 
in order to choose p Eigenvectors: Kaiser method. 

Eigenvectors: Kaiser method (PCs with eigenvalues 
greater than 1) and ODV80 ones (PCs representing at 
least 80% of the original data variance). 

The p Eigenvectors are then rotated by an orthogonal 
or oblique rotation. The most used rotation algorithm is 
Varimax, which performs orthogonal rotation of the 
loadings. After the rotation all the components should 
assume positive values; small negative values are set 
zero. An abstract image of the source contributions to 
the samples can be obtained by multivariate linear 
regression: 



Z = PCS*V T 1 } 

where Z is the scaled data matrix, PCS is the principal 
component scores matrix, and V T is the transposed ro- 
tated loading (Eigenvectors) matrix. 

In order to pass from the abstract contributions to real 
ones, a fictitious sample Z 0 , where all concentrations are 
zero, is built [43,50]. Details about the method can be 
found in the reference 49: the APCS matrix can be iden- 
tified with the estimated contributions matrix F r . A re- 
gression on the data matrix X allows to obtain the 
estimated source profiles matrix A r . At last the product 
of the matrices F r and A r allows to recalculate the data 
matrix X r (reconstructed data matrix). The reconstruc- 
tion percentage error of the model has been calculated 
as percent relative root mean square errors (RRMSE) as 
shown in reference [49]. 
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